On the Mapping of Index Compression Techniques on CSR Information Retrieval

Authors

  • Sterling Stuart Stein
  • Nazli Goharian
Abstract

Information retrieval is the selection of documents relevant to a query. An inverted index is the conventional way to store the index of a collection. Because of the large amounts of data involved, compression techniques are commonly used in information retrieval systems to reduce the size of the inverted index. We experimentally evaluate the mapping of such techniques onto Compressed Sparse Row (CSR) information retrieval (IR). Our experimental results, using compression techniques such as Elias Gamma, Golomb, Interpolative, and fixed-length Byte-Aligned coding, demonstrate that these techniques can easily be applied to compress the index in CSR IR.
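To make the idea concrete, the following is a minimal sketch of how one of the named techniques, Elias Gamma, might be applied to a CSR index. It is not the authors' implementation. It assumes the column indices within each CSR row are sorted and distinct, stores the gaps between consecutive indices rather than the indices themselves, and represents the code as a string of "0"/"1" characters for readability. The function names (`gamma_encode`, `gamma_decode_all`, `compress_csr_columns`, `decompress_row`) and the `row_ptr`/`col_idx` array names are hypothetical.

```python
def gamma_encode(n):
    """Elias gamma code of a positive integer, returned as a bit string."""
    if n < 1:
        raise ValueError("Elias gamma encodes positive integers only")
    binary = bin(n)[2:]                        # e.g. 4 -> "100"
    return "0" * (len(binary) - 1) + binary    # unary length prefix, then binary


def gamma_decode_all(bits):
    """Decode a concatenation of Elias gamma codes into a list of integers."""
    values, i = [], 0
    while i < len(bits):
        zeros = 0
        while bits[i] == "0":                  # count the unary length prefix
            zeros += 1
            i += 1
        values.append(int(bits[i:i + zeros + 1], 2))
        i += zeros + 1
    return values


def compress_csr_columns(row_ptr, col_idx):
    """Gamma-encode the gaps between sorted column indices of each CSR row.

    Assumption: indices in a row are strictly increasing, so every gap is >= 1.
    The first gap is measured from -1 so that column 0 encodes as gap 1.
    """
    rows = []
    for r in range(len(row_ptr) - 1):
        prev, codes = -1, []
        for c in col_idx[row_ptr[r]:row_ptr[r + 1]]:
            codes.append(gamma_encode(c - prev))
            prev = c
        rows.append("".join(codes))
    return rows


def decompress_row(bits):
    """Recover the column indices of one row from its gamma-coded gaps."""
    cols, prev = [], -1
    for gap in gamma_decode_all(bits):
        prev += gap
        cols.append(prev)
    return cols


# Two rows: row 0 has nonzeros in columns 0, 4, 5; row 1 in columns 2, 9.
row_ptr = [0, 3, 5]
col_idx = [0, 4, 5, 2, 9]
compressed = compress_csr_columns(row_ptr, col_idx)
restored = [decompress_row(b) for b in compressed]
```

A real system would pack the bits into machine words instead of character strings; the string form is used here only so the codes can be inspected by eye. Gap encoding pays off because small gaps receive short codes, which is the same reasoning used for postings lists in an inverted index.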


Similar resources

Comparative Analysis of Sparse Matrix Algorithms

We evaluate and compare the storage efficiency of different sparse matrix storage formats as index structures for text collections, and their corresponding sparse matrix-vector multiplication algorithms for query processing in information retrieval (IR) applications. We show the results of our implementations for several sparse matrix algorithms such as Coordinate Storage (COO), Compressed Spa...


BRAIN MAPPING IN NEUROSURGERY

Background and Aim: Brain mapping is the study of the anatomy and function of the central nervous system (CNS). Brain mapping comprises many techniques, and these techniques are constantly changing and being updated. Initially, brain mapping was invasive and required electrical stimulation of the exposed brain. However, nowadays brain mapping does not require electrical stimula...


Investigating the Effects of Stemming on Information Retrieval in the Persian Language

Using language-specific behavior in information retrieval systems can significantly improve the quality of the retrieved results. The part of a word that remains after removing its affixes is called the stem. Stemming can be used to improve the relevance of the results in an information retrieval system. Different morphological variants of words (plural, past tense…) will be mapped into t...


Improved Skips for Faster Postings List Intersection

Information retrieval can be achieved through computerized processes by generating a list of relevant responses to a query. The document processor, matching function, and query analyzer are the main components of an information retrieval system. Document retrieval systems are fundamentally based on Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...




Publication date: 2003